Feature Selection Ensemble
نویسندگان
چکیده
Many strategies have been exploited for the task of feature selection, in an effort to identify more compact and better quality feature subsets. Such techniques typically involve the use of an individual feature significance evaluation, or a measurement of feature subset consistency, that work together with a search algorithm in order to determine a quality subset. Feature selection ensemble aims to combine the outputs of multiple feature selectors, thereby producing a more robust result for the subsequent classifier learning tasks. In this paper, three novel implementations of the feature selection ensemble concept are introduced, generalising the ensemble approach so that it can be used in conjunction with many subset evaluation techniques, and search algorithms. A recently developed heuristic algorithm: harmony search is employed to demonstrate the approaches. Results of experimental comparative studies are reported in order to highlight the benefits of the present work. The paper ends with a proposal to extend the application of feature selection ensemble to aiding the development of biped robots (inspired by the authors’ involvement in the joint celebration of Olympic and the centenary of the birth of Alan Turing).
منابع مشابه
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملسودمندی رگرسیونهای تجمیعی و روشهای انتخاب متغیرهای پیشبین بهینه در پیشبینی بازده سهام
مقاله حاضر به بررسی سودمندی رگرسیونهای تجمیعی و روشهای انتخاب متغیرهای پیشبین بهینه (شامل روش مبتنی بر همبستگی و ریلیف) برای پیشبینی بازده سهام شرکتهای پذیرفته شده در بورس اوراق بهادار تهران میپردازد. بهمنظور ارزیابی عملکرد رگرسیون تجمیعی، معیارهای ارزیابی (شامل میانگین قدرمطلق درصد خطا، مجذور مربع میانگین خطا و ضریب تعیین) مربوط به پیشبینی این روش، با رگرسیون خطی و شبکههای عصبی مصنوعی...
متن کاملMLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
متن کاملThe prediction of lymphedema via the combination of the selected data mining algorithms
Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Due to the importance of predicting this disease, the use of data mining methods in medical research is more significant than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim Of this study was to create a diagnosis system that can ...
متن کاملEnhanced Classification Accuracy for Cardiotocogram Data with Ensemble Feature Selection and Classifier Ensemble
In this paper ensemble learning based feature selection and classifier ensemble model is proposed to improve classification accuracy. The hypothesis is that good feature sets contain features that are highly correlated with the class from ensemble feature selection to SVM ensembles which can be achieved on the performance of classification accuracy. The proposed approach consists of two phases:...
متن کاملCombining Feature Selection and Ensemble Learning for Software Quality Estimation
High dimensionality is a major problem that affects the quality of training datasets and therefore classification models. Feature selection is frequently used to deal with this problem. The goal of feature selection is to choose the most relevant and important attributes from the raw dataset. Another major challenge to building effective classification models from binary datasets is class imbal...
متن کامل